    A Distributed and Accountable Approach to Offline Recommender Systems Evaluation

    Different software tools have been developed for performing offline evaluations of recommender systems. However, the results obtained with these tools may not be directly comparable because of subtle differences in the experimental protocols and metrics. Furthermore, it is difficult to analyze several algorithms under the same experimental conditions without disclosing their implementation details. For these reasons, we introduce RecLab, an open source software for evaluating recommender systems in a distributed fashion. By relying on consolidated web protocols, we created RESTful APIs for training and querying recommenders remotely. In this way, it is possible to easily integrate algorithms realized with different technologies into the same toolkit. In detail, the experimenter can perform an evaluation by simply visiting a web interface provided by RecLab. The framework then interacts with all the selected recommenders and computes and displays a comprehensive set of measures, each representing a different metric. The results of all experiments are permanently stored and publicly available in order to support accountability and comparative analyses. Comment: REVEAL 2018 Workshop on Offline Evaluation for Recommender System

    Knowledge extraction from unstructured data and classification through distributed ontologies

    The World Wide Web has changed the way humans use and share any kind of information. The Web removed several barriers to accessing published information and has become an enormous space where users can easily navigate through heterogeneous resources (such as linked documents) and can easily edit, modify, or produce them. Documents implicitly enclose information and relationships among them that are accessible only to human beings. Indeed, the Web of documents evolved towards a space of data silos, linked to each other only through untyped references (such as hypertext references) that only humans were able to understand. A growing desire to programmatically access the pieces of data implicitly enclosed in documents has characterized the latest efforts of the Web research community. Direct access means structured data, thus enabling computing machinery to easily exploit the linking of different data sources. It has become crucial for the Web community to provide a technology stack for easing data integration at large scale, first structuring the data using standard ontologies and afterwards linking them to external data. Ontologies became the best practice to define axioms and relationships among classes, and the Resource Description Framework (RDF) became the basic data model chosen to represent the ontology instances (i.e. an instance is a value of an axiom, class or attribute). Data becomes the new oil; in particular, extracting information from semi-structured textual documents on the Web is key to realizing the Linked Data vision. In the literature these problems have been addressed with several proposals and standards that mainly focus on technologies to access the data and on formats to represent the semantics of the data and their relationships. With the increasing volume of interconnected and serialized RDF data, RDF repositories may suffer from data overloading and may become a single point of failure for the overall Linked Data vision.
One of the goals of this dissertation is to propose a thorough approach to managing large scale RDF repositories and to distributing them in a redundant and reliable peer-to-peer RDF architecture. The architecture consists of a logic to distribute and mine the knowledge and of a set of physical peer nodes organized in a ring topology based on a Distributed Hash Table (DHT). Each node shares the same logic and provides an entry point that enables clients to query the knowledge base using atomic, disjunctive and conjunctive SPARQL queries. The consistency of the results is increased using a data redundancy algorithm that replicates each RDF triple in multiple nodes so that, in the case of peer failure, other peers can retrieve the data needed to resolve the queries. Additionally, a distributed load balancing algorithm is used to maintain a uniform distribution of the data among the participating peers by dynamically changing the key space assigned to each node in the DHT. Recently, the process of data structuring has gained more and more attention when applied to the large volume of text information spread on the Web, such as legacy data, newspapers, scientific papers or (micro-)blog posts. This process mainly consists of three steps: i) the extraction from the text of atomic pieces of information, called named entities; ii) the classification of these pieces of information through ontologies; iii) their disambiguation through Uniform Resource Identifiers (URIs) identifying real world objects. As a step towards interconnecting the web to real world objects via named entities, different techniques have been proposed. The second objective of this work is to propose a comparison of these approaches in order to highlight strengths and weaknesses in different scenarios such as scientific papers and newspapers, or user generated content.
We created the Named Entity Recognition and Disambiguation (NERD) web framework, publicly accessible on the Web (through a REST API and a web User Interface), which unifies several named entity extraction technologies. Moreover, we proposed the NERD ontology, a reference ontology for comparing the results of these technologies. Recently, the NERD ontology has been included in the NIF (Natural language processing Interchange Format) specification, part of the Creating Knowledge out of Interlinked Data (LOD2) project. In summary, this dissertation defines a framework for the extraction of knowledge from unstructured data and its classification via distributed ontologies. A detailed study of the Semantic Web and knowledge extraction fields is presented to define the issues investigated in this work. Then, it proposes an architecture to tackle the single point of failure issue introduced by the RDF repositories spread within the Web. Although the use of ontologies enables a Web where data is structured and comprehensible by computing machinery, human users may also take advantage of it, especially for the annotation task. Hence, this work describes an annotation tool for web editing and for audio and video annotation, with a web front end User Interface built on top of a distributed ontology. Furthermore, this dissertation details a thorough comparison of the state of the art of named entity technologies. The NERD framework is presented as a technology to encompass existing solutions in the named entity extraction field, and the NERD ontology is presented as a reference ontology in the field.
Finally, this work highlights three use cases aimed at reducing the amount of data silos spread within the Web: a Linked Data approach to augment the automatic classification task in a Systematic Literature Review, an application to lift educational data stored in Sharable Content Object Reference Model (SCORM) data silos to the Web of data, and a scientific conference venue enhancer plugged on top of several live data collectors. Significant research efforts have been devoted to combining the efficiency of a reliable data structure and the importance of data extraction techniques. This dissertation opens different research doors which mainly join two different research communities: the Semantic Web and the Natural Language Processing community. The Web provides a considerable amount of data on which NLP techniques may shed light. The use of the URI as a unique identifier may provide one milestone for the materialization of entities lifted from a raw text to real world object
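
The replication of RDF triples over a DHT ring, as outlined in this abstract, can be illustrated with a minimal sketch. It is not the dissertation's implementation: the SHA-1 hashing, the `Ring` class, its method names, the peer names, and the replication factor of 2 are all assumptions made for illustration.

```python
import hashlib
from bisect import bisect_right


def ring_position(key: str) -> int:
    """Map a string to a position on the ring (SHA-1 is an assumed choice)."""
    return int(hashlib.sha1(key.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Hypothetical DHT ring: each triple is stored on `replicas` successive peers."""

    def __init__(self, nodes, replicas=2):
        self.replicas = replicas
        # Peers sorted by their position on the ring.
        self.positions = sorted((ring_position(n), n) for n in nodes)
        self.store = {n: set() for n in nodes}

    def owners(self, triple):
        """Return the peers responsible for a triple: its successor and the following ones."""
        keys = [p for p, _ in self.positions]
        start = bisect_right(keys, ring_position(" ".join(triple)))
        count = min(self.replicas, len(self.positions))
        return [self.positions[(start + i) % len(self.positions)][1] for i in range(count)]

    def insert(self, triple):
        """Store the triple on every peer responsible for it."""
        for node in self.owners(triple):
            self.store[node].add(triple)

    def lookup(self, triple, failed=()):
        """Return True if at least one live replica still holds the triple."""
        return any(triple in self.store[n] for n in self.owners(triple) if n not in failed)


ring = Ring(["peer-a", "peer-b", "peer-c", "peer-d"], replicas=2)
triple = ("ex:Turin", "rdf:type", "ex:City")
ring.insert(triple)
primary = ring.owners(triple)[0]
still_reachable = ring.lookup(triple, failed=[primary])  # second replica answers
```

With two replicas, the triple stays reachable when its first owner fails and becomes unreachable only when both owners fail. The dynamic key-space rebalancing mentioned in the abstract is not modelled here.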

    Sequeval: A Framework to Assess and Benchmark Sequence-based Recommender Systems

    In this paper, we present sequeval, a software tool capable of performing the offline evaluation of a recommender system designed to suggest a sequence of items. A sequence-based recommender is trained considering the sequences already available in the system and its purpose is to generate a personalized sequence starting from an initial seed. This tool automatically evaluates the sequence-based recommender considering a comprehensive set of eight different metrics adapted to the sequential scenario. sequeval has been developed following the best practices of software extensibility. For this reason, it is possible to easily integrate and evaluate novel recommendation techniques. sequeval is publicly available as an open source tool and it aims to become a focal point for the community to assess sequence-based recommender systems. Comment: REVEAL 2018 Workshop on Offline Evaluation for Recommender System

    A Semantic Web Annotation Tool for a Web-Based Audio Sequencer

    Music and sound have a rich semantic structure that is clear to the composer and the listener but remains mostly hidden to computing machinery. Nevertheless, in recent years, the introduction of software tools for music production has enabled new opportunities for migrating this knowledge from humans to machines. A new generation of these tools may exploit the coupling of sound samples and semantic information for the creation not only of a musical, but also of a "semantic" composition. In this paper we describe an ontology driven content annotation framework for a web-based audio editing tool. In a supervised approach, during the editing process, the graphical web interface allows the user to annotate any part of the composition with concepts from publicly available ontologies. As a test case, we developed a collaborative web-based audio sequencer that provides users with the functionality to remix the audio samples from the Freesound website and subsequently annotate them. The annotation tool can load any ontology and thus gives users the opportunity to augment the work with annotations on the structure of the composition, the musical materials, and the creator's reasoning and intentions. We believe this approach will provide several novel ways to make not only the final audio product, but also the creative process, first class citizens of the Semantic We

    Use of biochar geostructures for urban stormwater water cleanup

    Introduction: Stormwater runoff from urban catchment areas is a leading contributor to water quality pollution, which can result in limitations on urban development. Engineering systems used for the treatment of stormwater runoff use, in most cases, non-renewable resources. Biochar, or charcoal, is a renewable resource and is being investigated as a filtration medium for stormwater cleanup. Background: Engineering systems are currently available to control the volume of runoff from urban catchments after a storm event and to influence the runoff water quality. In these engineered systems the water is not only slowed down; physical, chemical and microbial processes are also utilized for the removal of unwanted contaminants. An organic medium being researched for stormwater cleanup is Biochar. Biochar is a form of charcoal produced through the thermochemical conversion of organic materials or biomass. The biomass remaining after pyrolysis is a fine-grained, highly porous material with a large surface area, which makes it highly adsorbent. Methodology: The use of Biochar for improving stormwater quality has been growing worldwide, with product developers and researchers working to prove and advance the science and markets of this emerging material. This thesis has been compiled using research material collated from various sources, which provides insight into the use of Biochar geostructures for urban stormwater cleanup. Collectively, the material contained within this thesis represents research already undertaken by other parties; however, it will provide information on emerging technologies using biochar. Key Outcomes: Initial trials using biochar as a medium for improving the quality of urban stormwater runoff have provided positive results. Additional research is required to determine cost-effective, easily maintainable designs and to monitor performance versus economic considerations for the use of biochar geostructures.
Research using enzyme additives to improve biochar performance is emerging. Further Work: The next stage is to use biochar as a medium in different geostructures for urban stormwater cleanup and to record the resulting reduction of heavy metals, herbicides and organics in stormwater. Conclusions: The use of Biochar for improving stormwater quality in urban catchments is in its infancy for practical testing. The type of biomass used to create Biochar has an effect on its performance in improving stormwater runoff quality. Research is continuing to evolve to determine whether enzymes can be used to improve the performance of Biochar

    Creating enriched YouTube media fragments With NERD using timed-text

    This demo enables the automatic creation of semantically annotated YouTube media fragments. A video is first ingested in the Synote system and a new method retrieves its associated subtitles or closed captions. Next, NERD is used to extract named entities from the transcripts, which are then temporally aligned with the video. The entities are disambiguated in the LOD cloud and a user interface lets users browse through the entities detected in a video or get more information. We evaluated our application with 60 videos from 3 YouTube channels
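
The temporal alignment step described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the entity list is supplied by hand rather than obtained from NERD, matching is a plain case-insensitive substring test, the `Cue` type, the `media_fragments` function and the example URI are hypothetical, and the `#t=start,end` suffix follows the temporal dimension of the W3C Media Fragments URI syntax.

```python
from typing import NamedTuple


class Cue(NamedTuple):
    """One subtitle cue: start and end time in seconds, plus its text."""
    start: float
    end: float
    text: str


def media_fragments(video_uri, cues, entities):
    """Return (entity, fragment URI) pairs for every cue whose text mentions the entity."""
    fragments = []
    for cue in cues:
        lowered = cue.text.lower()
        for entity in entities:
            if entity.lower() in lowered:
                # Temporal fragment: "#t=<start>,<end>" in seconds.
                fragments.append((entity, f"{video_uri}#t={cue.start:g},{cue.end:g}"))
    return fragments


cues = [
    Cue(0, 4.5, "Welcome to Turin"),
    Cue(4.5, 9, "and to the conference"),
]
result = media_fragments("http://example.org/video", cues, ["Turin"])
```

Here `result` holds a single pair linking the entity "Turin" to the first cue's time span. Disambiguation against the LOD cloud is outside the scope of this sketch.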

    INTERFACE MODEL FOR THE NONLINEAR ANALYSIS OF BLOCKY STRUCTURES OF ANCIENT GREEK TEMPLES

    The presence of singularity surfaces with reference to the displacement field is a characteristic of a number of structural systems. Strong discontinuities are present in old masonry structures where dry joints connect the blocks or where the ageing of the mortar suggests neglecting its adhesion properties. These structures cannot be considered a continuum but rather an assembly of blocks. These discontinuous structures can be modelled as an assembly of blocks interacting through frictional joints whose mechanical behaviour is described by appropriate interface laws. In the present work an interface model from the literature, the double asperity model, is adopted and implemented in a standard finite element code with the principal aim of developing structural analyses of old monumental masonry structures. The interface model is briefly illustrated and the numerical implementation of the interface laws is described in detail. Numerical examples are presented to simulate the behaviour of a couple of Greek temples in Agrigento, Italy. These old monumental structures, dating to the IV-VI century BC, are inscribed on the UNESCO World Heritage List

    Fast training of self organizing maps for the visual exploration of molecular compounds

    Visual exploration of scientific data in the life sciences is a growing research field due to the large amount of available data. Kohonen's Self Organizing Map (SOM) is a widely used tool for the visualization of multidimensional data. In this paper we present a fast learning algorithm for SOMs that uses a simulated annealing method to adapt the learning parameters. The algorithm has been adopted in a data analysis framework for the generation of similarity maps. Such maps provide an effective tool for the visual exploration of large and multi-dimensional input spaces. The approach has been applied to data generated during the High Throughput Screening of molecular compounds; the generated maps allow a visual exploration of molecules with similar topological properties. The experimental analysis on real world data from the National Cancer Institute shows the speed up of the proposed SOM training process in comparison to a traditional approach. The resulting visual landscape groups molecules with similar chemical properties in densely connected regions
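
For orientation, the traditional SOM training that this abstract uses as a baseline can be sketched as below. This is not the paper's fast algorithm: the learning rate and neighbourhood radius follow a fixed exponential decay instead of the simulated annealing adaptation the authors propose, and the function names, grid size, decay constant and default parameters are assumptions.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid coordinate whose weight vector is closest to x."""
    return min(weights, key=lambda k: sum((a - b) ** 2 for a, b in zip(weights[k], x)))


def train_som(data, rows, cols, epochs=20, lr0=0.5, sigma0=None, seed=0):
    """Train a rows x cols SOM with a fixed exponential decay schedule (assumed)."""
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = sigma0 or max(rows, cols) / 2.0
    # One weight vector per grid cell, initialised uniformly in [0, 1).
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    steps = epochs * len(data)
    t = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # shuffled pass over the data
            frac = t / steps
            lr = lr0 * math.exp(-3.0 * frac)        # decaying learning rate
            sigma = sigma0 * math.exp(-3.0 * frac)  # shrinking neighbourhood radius
            bmu = best_matching_unit(weights, x)
            for (r, c), w in weights.items():
                d2 = (r - bmu[0]) ** 2 + (c - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma * sigma))  # Gaussian neighbourhood
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            t += 1
    return weights


samples = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
som = train_som(samples, rows=2, cols=2, epochs=5)
```

Each update moves the best matching unit and its grid neighbours towards the sample, which is what lets similar inputs end up in nearby cells of the map.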